It is shown how various timing recovery schemes are reasonable
approximations of the maximum likelihood strategy for estimating an 
unknown timing parameter in additive white gaussian noise. These 
schemes derive an appropriate error signal from the received data 
which is then used in a closed-loop system to change the timing phase 
of a voltage-controlled oscillator. The technique of stochastic approxi- 
mation is utilized to cast the synchronization problem as a regression 
problem and to develop an estimation algorithm which rapidly
converges to the desired sampling time. This estimate does not depend
upon knowledge of the system impulse response, is independent of 
the noise distribution, is computed in real time, and can be synthesized 
as a feedback structure. As is characteristic of stochastic approxima- 
tion algorithms, the current estimate is the sum of the previous esti- 
mate and a time-varying weighted approximation of the estimation 
error. The error is approximated by sampling the derivative of the 
received signal, and the mean-square error of the resulting estimate 
is minimized by optimizing the choice of the gain sequence. 

If the receiver is provided with an ideal reference (or if the data
error rate is small) it is shown that both the bias and the jitter
(mean-square error) of the estimator approach zero as the number of
iterations becomes large. The rate of convergence of the algorithm is
derived and examples are provided which indicate that reliable
synchronization information can be quickly acquired.

I. INTRODUCTION* 

The problem of symbol synchronization in digital data transmission 
in the presence of intersymbol interference is extremely complicated. 
The best sampling instants are channel dependent and are in general 
difficult to determine. Consequently, the problem of timing recovery in 
high-speed data transmission is intimately tied in with adaptive
equalization. Since general methods for simultaneous optimum
determination of the receiver parameters are not known, these parameters
are independently determined.

Timing information is usually obtained directly from the data wave 
in a variety of ways. 1-3 Our objectives in this paper are: 

(i) To indicate the optimum method (maximum likelihood) for
estimating an unknown timing parameter from random data 
for a certain class of PAM data transmission systems; 

(ii) To show that a variety of timing recovery methods currently
in use are reasonable approximations of the optimum method, 
and to note that the generation of an error signal from the 
received signal is a feature common to these methods; 

(iii) To demonstrate that timing recovery dynamics can often be 
studied and controlled through the application of stochastic 
approximation theory. 4-6 

Identifying the desired timing parameter as the solution of a re- 
gression equation will allow us to apply stochastic approximation 
theory to the symbol synchronization problem. For purposes of illus- 
tration we analyze a stochastic approximation timing recovery pro- 
cedure for square-wave modulation. For this example we derive 
asymptotic formulas for the probability of error as a function of 
signal-to-noise ratio and the number of iterations used in the timing 
recovery loop. Since the number of iterations is directly proportional 
to the number of signaling intervals, insight is provided into the setup 
time required to achieve reliable symbol synchronization. 

We finally focus on the more difficult problem of timing recovery 
in bandlimited PAM systems. Here timing information must be ob- 
tained in the presence of intersymbol interference as well as additive 
noise. A stochastic approximation algorithm is presented which derives 
symbol synchronization (i.e., estimates the desired sampling time) 
from the received data in a quick and accurate manner. The estima- 
tion algorithm developed does not require explicit knowledge of the 
system impulse response or the noise distribution. If the impulse 
response of the channel satisfies certain conditions, then the algorithm 
will converge in mean-square provided the gain sequence is properly 
chosen. Symbol synchronization is obtained by adjusting the sampling 
time in the following manner: at the end of each symbol interval the 
current estimate is taken to be the sum of the previous estimate and a 
weighted approximation to the actual estimation error. The desired 
sampling time is assumed to be that instant when the system impulse
response is a maximum. For this sampling time it is shown that a
reasonable approximation to the estimation error is the sampled
derivative of the received signal.† When the error is small, its
evolution can be described by a first-order random difference equation.
At every iteration the mean-square error (mse) can be minimized by
optimizing the choice of the (time-varying) weighting sequence. The
optimum weighting sequence is of the form $1/(\alpha + \beta n)$, where
$\alpha$ and $\beta$ are quantities which depend on the system impulse
response and noise power, and $n$ is the discrete time index. Since
$\alpha$ and $\beta$ are generally unknown at the receiver they may
either be estimated (giving rise to an adaptive synchronization
algorithm) or picked arbitrarily. In an effort to overcome the lack of
knowledge of $\alpha$ and $\beta$ (in addition to simplifying the
algorithm) it is tempting to use the asymptotic form of the gain, $c/n$,
where $c$ is a constant. However, if $\beta \ll \alpha$ then the
optimum gain is essentially a constant ($1/\alpha$) for many iterations,
and for a wide range of $c$ the estimate obtained using $c/n$ is shown
to be unreliable. Hence it appears that in order to obtain satisfactory
performance some adaptivity to determine $\alpha$ and $\beta$ should be
used in any realization of the algorithm.

Under the assumptions that the receiver error rate is small (so that
an ideal reference can be assumed) and that the "eye" of the
differentiated impulse response is open, the optimum mse is
asymptotically of the form $1/(\rho n)$, where $\rho$ is a
"signal-to-noise" ratio. The "signal" term is the value of the slope of
the differentiated impulse response near the origin; the "noise" term is
the sum of the actual noise variance and two intersymbol interference
type terms. Thus the mse can be driven to zero, and an example is given
to illustrate how an accurate estimate can be obtained in a few
signaling intervals. We show that for a $\sin x/x$ impulse response,
ten iterations will drive the mean-square error to less than 0.01 of a
signaling interval.

In Section II we determine the maximum likelihood estimate of
an unknown timing parameter for a baseband PAM data signal which
has been contaminated by white gaussian noise. Several approximations
to the optimum estimator are described in Section III. The
theory of stochastic approximation is introduced in Section IV, and is
used both to cast the synchronization problem as a regression problem,
and to analyze and control the dynamics of timing recovery. In
Section V we discuss a timing recovery algorithm for bandlimited PAM.

† B. R. Saltzberg 7 has suggested a technique for timing recovery which uses
this approximation. His investigation is restricted to algorithms which can be
realized using time-invariant devices. The algorithm we develop exploits the
advantages of using time-varying elements.

II. THE MAXIMUM LIKELIHOOD ESTIMATOR OF AN UNKNOWN TIMING 
PARAMETER 

Consider the $L$-level data wave in additive white gaussian noise
$\nu(t)$ of double-sided spectral density $N_0$,

$$V(t) = \sum_n a_n h(t - nT - \tau^*) + \nu(t), \tag{1}$$

where $\{a_n\}$ are the data symbols taking on values $\pm d, \pm 3d,
\ldots, \pm(L-1)d$ with equal probability, $h(t)$ is a bandlimited pulse
whose peak value occurs at $\tau^*$, and $-T/2 \le \tau^* \le T/2$ is an
unknown timing parameter.†

Detection of the data symbols $\{a_n\}$ is usually accomplished by first
suitably filtering $V(t)$ and then sampling the output at time instants
$\tau + kT$, $k = \pm 1, \pm 2, \ldots$. The resulting error rate is a
function of $\tau$ in addition to other parameters. An ideal timing
recovery system would supply the detector with the $\tau$ which
minimizes the probability of error. While this problem is conceptually
straightforward, it is not analytically tractable and the structure of
such an optimum timing recovery system is not yet generally known. We
therefore must resort to a less Utopian criterion.

Much simpler evaluation functions often used in data transmission 8
are

$$D_j(\tau - \tau^*) = \frac{1}{h^j(\tau - \tau^*)} \sum_{k \ne 0} |h(\tau - \tau^* - kT)|^j, \qquad j = 1 \text{ or } 2. \tag{2}$$

Even for these relatively simple evaluation functions it is generally
difficult to find the optimum $\tau$. R. W. Chang 9 derives timing
recovery procedures based on minimizing a particular version of
equation (2). However, for a certain class of linear distortions, namely
the type that gives rise to symmetrical pulse shapes, the best $\tau$,
which minimizes (2), is equal to the unknown parameter $\tau^*$. For
this class of channels the problem of optimal timing recovery procedures
can be cast in the language of statistical estimation theory. This is
the situation treated in this section.



† We assume throughout that $\tau^*$ is independent of time.

The statistical problem we pose is this: determine an estimation
procedure for the parameter $\tau$ based on observations made on the
received signal $V(t)$ [equation (1)]. The more detailed question we
wish to answer is the following. How should the observed signal, say
for $T_s$ seconds, be processed such that a "good" estimate of $\tau^*$
is obtained? The answer of course depends on what one means by good.
A reasonable measure of goodness is to require that the estimate
maximize the likelihood function of the unknown parameter. For
binary transmission this is a classical problem for which a solution
is known. (See, for example, Refs. 3, 10, and 11.)† The extension to
multilevel signaling is straightforward and we now briefly sketch the
derivation. The likelihood function of the received signal is
proportional to (superfluous constants are omitted)



$$L[V] \sim E_a\left\{\exp\left[-\frac{1}{2N_0}\int_0^{T_s} [V(t) - s(t;\tau)]^2\,dt\right]\right\}, \tag{3}$$

where $s(t;\tau) = \sum_n a_n h(t - nT - \tau)$ and $E_a\{\cdot\}$
denotes expectation with respect to the data symbols. The expectation
indicated in (3) can be carried out provided the reasonable assumption
is made that the power in the data signal $s(t;\tau)$ when measured over
an interval $[0, T_s]$ (large compared with a symbol duration) is
independent of the data sequence and the unknown parameter $\tau$. This
assumption leads to a simplified version of (3)

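$$L[V] \sim E_a\left\{\exp\left[\frac{1}{N_0}\int_0^{T_s} V(t)\,s(t;\tau)\,dt\right]\right\}, \tag{4}$$

which, after averaging over the equally likely data levels, becomes

$$L[V] \sim \prod_n \sum_{\substack{k\ \mathrm{odd} \\ 0 < k < L}} \cosh\left(\frac{kd}{N_0}\, z_n(\tau)\right), \tag{5}$$

where

$$z_n(\tau) = \int_0^{T_s} V(t)\,h(t - nT - \tau)\,dt \tag{6}$$

is recognized as the sampled (at times $nT + \tau$) output of a filter
matched to $h(t)$, whose input is $V(t)$.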

The maximum likelihood estimate (MLE) is obtained by differentiating
$L[V]$ with respect to $\tau$ and setting the resulting expression
to zero. An equivalent strategy may be obtained by differentiating any
monotonic function of $L$, and a convenient such function in this
application is the logarithmic function. From equation (5)

$$\Lambda[V] = \ln L[V] \sim \sum_n \ln\left[\sum_{k\ \mathrm{odd}} \cosh\left(\frac{kd}{N_0}\, z_n(\tau)\right)\right], \tag{7a}$$

and upon differentiation we obtain

$$\frac{d\Lambda}{d\tau} = \sum_n \left[\frac{\sum_{k=1}^{L/2} (2k-1)\sinh\left(\frac{(2k-1)d}{N_0}\, z_n(\tau)\right)}{\sum_{k=1}^{L/2} \cosh\left(\frac{(2k-1)d}{N_0}\, z_n(\tau)\right)}\right] \frac{d}{N_0}\,\frac{dz_n(\tau)}{d\tau}, \tag{7b}$$

where the bracketed term can be shown to be†

$$\frac{(L-1)\sinh\left(\frac{(L+1)d}{N_0}\, z_n(\tau)\right) - (L+1)\sinh\left(\frac{(L-1)d}{N_0}\, z_n(\tau)\right)}{\cosh\left(\frac{(L+1)d}{N_0}\, z_n(\tau)\right) - \cosh\left(\frac{(L-1)d}{N_0}\, z_n(\tau)\right)}, \tag{7c}$$

and for the typical data communication environment of a large
signal-to-noise ratio the above expression becomes proportional to

$$(L-1)\tanh\left(\frac{(L+1)d}{N_0}\, z_n(\tau)\right).$$

Thus we finally have that

$$\frac{\partial\Lambda}{\partial\tau} \sim \sum_n \tanh\left[\frac{(L+1)d}{N_0}\, z_n(\tau)\right] \frac{dz_n(\tau)}{d\tau}. \tag{8}$$

† None of the references cited claims originality. It is difficult to determine
where the result was written down first.

The optimum estimation strategy is exhibited in equation (8). The best
value of $\tau$ (i.e., the MLE) makes the right-hand side of equation (8)
as small as possible. The mathematical operations exhibited in
equation (8) can readily be instrumented. The implementation objective
would be to use the right-hand side of (8) as an error signal in a
closed-loop system that iteratively adjusts $\tau$ to determine the MLE.
A block diagram of this implementation is shown in Fig. 1. The
received signal and its derivative are first passed through filters with
identical impulse responses $h(-t)$ whose outputs are periodically
sampled at times $nT + \tau$. In the undifferentiated branch, the samples
are first multiplied by $(L+1)d/N_0$ and are then passed through the
memoryless nonlinearity $\tanh(\cdot)$, which resembles an infinite
clipper for large input values. The outputs from the two branches are
multiplied and averaged as indicated by the sum in equation (8). This
then is the error signal driving a voltage-controlled oscillator which
in turn determines the new timing phase.

Fig. 1 — Implementation of maximum likelihood strategy.

† Note that for $L = 2$, equation (7c) becomes $(\sinh 3y - 3\sinh y)/(\cosh 3y -
\cosh y) = \tanh y$, which agrees with the bracketed term in (7b).
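The agreement between the bracketed ratio in (7b) and the closed form (7c) can be spot-checked numerically. The following is only a sketch: the function names are ours, and the argument `y` stands for $(d/N_0)\,z_n(\tau)$, with $L$ assumed even.

```python
import math

def bracket_sum(y, L):
    """Bracketed ratio of (7b): sums over odd k = 1, 3, ..., L-1 (L even)."""
    odd = range(1, L, 2)
    num = sum(k * math.sinh(k * y) for k in odd)
    den = sum(math.cosh(k * y) for k in odd)
    return num / den

def bracket_closed(y, L):
    """Closed form (7c); it is 0/0 at y = 0, so call it only for y != 0."""
    num = (L - 1) * math.sinh((L + 1) * y) - (L + 1) * math.sinh((L - 1) * y)
    den = math.cosh((L + 1) * y) - math.cosh((L - 1) * y)
    return num / den

def max_mismatch(levels=(2, 4, 8), points=(0.1, 0.7, 2.0)):
    """Largest absolute difference between the two forms on a small grid."""
    return max(abs(bracket_sum(y, L) - bracket_closed(y, L))
               for L in levels for y in points)
```

For $L = 2$ the ratio reduces to $\tanh y$, and for large $y$ it approaches $L - 1$, which is the saturation behavior that motivates replacing the bracket by a scaled $\tanh$ at large signal-to-noise ratio.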

III. IMPLEMENTATIONS APPROXIMATING THE OPTIMUM 

We now examine approximations of equations (7) and (8) leading to
several simplified implementations of timing recovery systems. The
first approach is to approximate $\tanh(x)$ in equation (8) by the
limiter function $\mathrm{sgn}(x)$. This approximation yields

$$\tanh\left[\frac{(L+1)d}{N_0}\, z_n(\tau)\right] \approx \mathrm{sgn}\, z_n(\tau) = \mathrm{sgn}\, \hat{a}_n, \tag{9}$$

where $\hat{a}_n$ is the $n$th decision, or the estimate of the $n$th
data symbol. The approximation (9) is a good one at large
signal-to-noise ratio and in this case $\hat{a}_n$ will equal $a_n$ most
of the time. When this approximation is made, the detection circuit
which computes $\hat{a}_n$ from $z_n(\tau)$ is separated from the timing
circuit. In the timing branch the received signal is first passed
through the filter with impulse response $h(-t)$ and the output is
differentiated, or equivalently passed through a high-pass filter, and
then sampled. These samples are multiplied by the sign of the respective
decisions and summed to form an error signal. The multiplication of the
respective derivative samples by the sign of the decisions is clearly
necessary so as to convert all the error samples to the same polarity.†
Figure 2 shows this simplified version of the detection and timing
recovery circuit. Deriving an error signal from the derivative of the
received signal is very reasonable and a timing circuit based on this
idea has been built and analyzed by Saltzberg. 7

Fig. 2 — An implementation approximating the ideal.

† This is a decision-directed estimation procedure. As the timing phase is
acquired, the decisions become more reliable.
Another technique suggested from (7) is dubbed "early-late" timing
recovery. 2,11 The approximations involved here are the following. First
the derivative of $\Lambda[V]$ is approximated by the difference

$$\sum_n \left\{\ln\cosh\left(\frac{kd}{N_0}\, z_n(\tau + \Delta)\right) - \ln\cosh\left(\frac{kd}{N_0}\, z_n(\tau - \Delta)\right)\right\}, \qquad \Delta \ll T. \tag{10}$$

Next the nonlinear function $\ln[\cosh(x)]$ is approximated by $|x|$.
This again is a good approximation at large signal-to-noise ratio since
for large $|x|$, $\cosh x \to e^{|x|}$. This implementation is shown in
Fig. 3. Here two clock pulses separated by $2\Delta$ sample the received
wave after appropriate filtering. The respective samples are then
full-wave rectified and subtracted from one another. The error signal is
formed by adding a number of successive differences. It appears that any
even $N$th-law device may be used in place of the $\ln(\cosh)$
nonlinearity in equation (7). Successful results for instance were
obtained with a square-law device. 12
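The early-late error signal, with $\ln\cosh$ replaced by the full-wave rectifier, can be sketched as follows. Everything specific here is an assumption made for illustration: the triangular filter output (the response of a matched filter to a rectangular pulse), the short fixed data pattern, the absence of noise, and all function names.

```python
def tri(t, T):
    """Triangular pulse of half-width T, peak 1 at t = 0 (hypothetical shape)."""
    return max(0.0, 1.0 - abs(t) / T)

def make_filter_output(data, T, tau_star):
    """Noise-free filter output z(t) = sum_n a_n * tri(t - nT - tau_star)."""
    def z(t):
        return sum(a * tri(t - n * T - tau_star, T) for n, a in enumerate(data))
    return z

def early_late_error(z, tau, delta, T, symbols):
    """Sum over symbols of |z(nT + tau + delta)| - |z(nT + tau - delta)|."""
    return sum(abs(z(n * T + tau + delta)) - abs(z(n * T + tau - delta))
               for n in symbols)

T = 1.0
tau_star = 0.2                       # true timing offset (hypothetical)
data = [1, -1, -1, 1, -1, 1, 1, -1]  # short binary pattern with transitions
z = make_filter_output(data, T, tau_star)
interior = range(1, len(data) - 1)   # skip the end symbols to avoid edge effects

# Sampling too early gives a positive error, too late a negative one,
# so the sign of the error tells the loop which way to move tau.
err_early = early_late_error(z, tau_star - 0.1, 0.05, T, interior)
err_late = early_late_error(z, tau_star + 0.1, 0.05, T, interior)
```

With this sign convention a positive error means the current phase lags the pulse peak, so the loop should advance $\tau$; only symbols adjacent to a data transition contribute.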

A feature common to the above timing recovery systems is the gen- 
eration of an error signal from the received signal. The sampling 
instant is then adjusted so as to decrease the magnitude of the error, 
a new error is computed, and the estimation continues in this manner. 



TIMING RECOVERY IN PAM 






The fewer the number of iterations needed to obtain a reliable esti- 
mate, the better the system. Stochastic approximation is a technique 
which will enable us to study and control the dynamic behavior of 
such iterative estimation algorithms by viewing the synchronization 
problem as a regression problem. 

IV. THE APPLICATION OF STOCHASTIC APPROXIMATION TO SYMBOL 
SYNCHRONIZATION 

4.1 Stochastic Approximation 

Fig. 3 — Implementation of early-late timing recovery scheme.

We will briefly describe the salient features of stochastic
approximation, in particular the Robbins-Monro algorithm. Stochastic
approximation 4-6 is a technique employed to iteratively solve
regression problems. The method is an extension of the Newton-Raphson
technique to a random environment, and is especially useful when the
regression function is unknown. More precisely, suppose $z_n$ is a
sequence of independent observations of a stationary random process
and it is desired to find the value of the (non-random) parameter
$\tau$ such that the regression equation

$$E[f(z_n; \tau)] \triangleq m(\tau) = m_0 \tag{11}$$

is satisfied, where $E$ denotes expectation, $f(\cdot)$ is a given
function, and $m(\cdot)$ is called the regression function. As
mentioned above, $m(\cdot)$ is typically unknown, and we desire an
algorithm which uses the data to sequentially estimate the value of
$\tau$, say $\tau^*$, which satisfies (11). Robbins and Monro have shown
that if (11) has a unique solution then the estimate $\tau_n$, given by

$$\tau_{n+1} = \tau_n + c_n[f(z_n; \tau_n) - m_0], \qquad n = 1, 2, \ldots,$$

will converge in mean-square and with probability one to $\tau^*$, under
some general conditions 4 on both the observations $z_n$ and on the
positive scalar time-varying weighting sequence $c_n$. A useful
interpretation of the Robbins-Monro algorithm is that the current
estimate is the sum of the previous estimate and a weighted correction
term, where the average (with respect to the observations) correction is
the error term $m(\tau_n) - m_0$. Thus the correction term will, on the
average, give an increment in the correct direction, and the estimate
will converge. Alternatively, if we regard the correction term as an
approximation (in a stochastic sense) to an error term, we are
reminded of the deterministic error or gradient search type of
algorithms. The weighting sequence $c_n$ is chosen to converge to zero
fast enough so as to suppress the correction term as the estimate
converges,† but slow enough so that large corrections are possible for
many iterations (frequently $c_n$ is of the form $1/n$).
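A minimal sketch of the Robbins-Monro recursion, written with the sign convention used above, may help fix ideas. The observation model is an assumption chosen only for illustration: $f(z_n; \tau) = (\tau^* - \tau) + \text{noise}$, so that $m(\tau) = \tau^* - \tau$ and $m_0 = 0$. The names `robbins_monro`, `noisy_observation`, and `TAU_STAR` are ours.

```python
import random

def robbins_monro(observe, tau0, m0, n_iter):
    """tau_{n+1} = tau_n + c_n * (f(z_n; tau_n) - m0) with gain c_n = 1/n."""
    tau = tau0
    for n in range(1, n_iter + 1):
        c_n = 1.0 / n
        tau = tau + c_n * (observe(tau) - m0)
    return tau

rng = random.Random(0)
TAU_STAR = 0.3  # hypothetical true parameter, unknown to the algorithm

def noisy_observation(tau):
    # One noisy sample of f(z_n; tau); its mean is m(tau) = TAU_STAR - tau,
    # which equals m0 = 0 only at tau = TAU_STAR.
    return (TAU_STAR - tau) + rng.gauss(0.0, 0.5)

estimate = robbins_monro(noisy_observation, tau0=-0.4, m0=0.0, n_iter=5000)
```

For this linear model and the gain $1/n$, the estimate after $n$ steps is $\tau^*$ plus the sample mean of the noise, so the initial guess is forgotten after the first step and the error shrinks as the noise averages out.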

† Note that even when $\tau_n$ is close to $\tau^*$, the variance of the
correction term can be quite large due to the randomness of the data.

We now cast the synchronization problem as a regression problem,
and then use the theory of stochastic approximation to develop a
synchronization algorithm which has desirable dynamic properties.
From (8) the optimum (maximum likelihood) timing parameter is
the solution of

$$\frac{d}{d\tau}[\Lambda(z_n; \tau)] = 0.$$

If we make the identification

$$\frac{d}{d\tau}[\Lambda(z_n; \tau)] \equiv f(z_n; \tau), \tag{12a}$$

and now ask for the value of $\tau$ which satisfies

$$m(\tau) = E\left[\frac{d}{d\tau}\Lambda(z_n; \tau)\right] = 0, \tag{12b}$$

then the desired [i.e., the solution of (12b)] timing parameter will
be the solution of a regression equation. It is important to note that
the solutions of (8) and (12b) will not, in general, be the same.
However the solution of (8) is a random variable, which as the
observation time $T_s$ becomes large converges to $\tau^*$; while the
solution† of (12b) is in fact $\tau^*$. Thus if we use a Robbins-Monro
algorithm to iteratively solve (12b) we are indeed generating the
maximum likelihood estimate.

4.2 Binary Square-Wave Modulation 

Consider applying this method to analyze a timing recovery
procedure when $h(t)$ in equation (1) is a rectangular pulse of $T$
seconds duration and height $A$, where binary transmission is assumed
for convenience. In this case, the observable function, equation (6),
becomes

$$z_n(\tau) = \int_0^{T_s} V(t)\,h(t - nT - \tau)\,dt = \int_{nT+\tau}^{(n+1)T+\tau} V(t)\,dt. \tag{13}$$

As mentioned earlier, we can use a square-law device to approximate
the $\ln\cosh(\cdot)$ nonlinearity for mathematical convenience. Thus
the MLE is obtained by finding a $\tau$ such that the derivative of
$E[z_n^2(\tau)]$ is zero. From (7a) and (13) we obtain

$$\frac{d}{d\tau} E[z_n^2(\tau)] = 2E\{[V((n+1)T + \tau) - V(nT + \tau)]\, z_n(\tau)\}. \tag{14}$$

At large signal-to-noise ratio, symbol transition information is
obtained from

$$d_n = V((n+1)T + \tau) - V(nT + \tau) \approx \begin{cases} 0, & a_{n+1} = a_n \\ \pm 1, & a_{n+1} \ne a_n. \end{cases} \tag{15}$$

The Robbins-Monro procedure for recursively estimating $\tau$ can now
be applied by using the regression function

$$m(\tau) = E[d_n z_n(\tau)]. \tag{16}$$

† For a high signal-to-noise ratio.

For convenience we center the pulse $h(t)$ at $t = 0$ such that

$$h(t) = \begin{cases} A, & |t| < T/2 \\ 0, & \text{elsewhere} \end{cases}$$

and calculate

$$\begin{aligned} d_n z_n(\tau) &= d_n\left\{\sum_m a_m \int_{nT+\tau}^{(n+1)T+\tau} h(t - mT)\,dt + \int_{nT+\tau}^{(n+1)T+\tau} \nu(t)\,dt\right\} \\ &= d_n A\left\{a_n \int_{nT+\tau}^{nT+T/2} dt + a_{n+1}\int_{nT+T/2}^{(n+1)T+\tau} dt\right\} + v_n \\ &= d_n A\{a_n(T/2 - \tau) + a_{n+1}(T/2 + \tau)\} + v_n, \end{aligned} \tag{17}$$

where

$$v_n = d_n \int_{nT+\tau}^{(n+1)T+\tau} \nu(t)\,dt.$$

In the absence of data transitions, (17) is independent of $\tau$, while
when a transition occurs, i.e., $a_n \ne a_{n+1}$,

$$d_n z_n(\tau) = 2A\tau + v_n, \qquad -T/2 \le \tau \le T/2. \tag{18}$$

Using (13) the recursive procedure for estimating the unknown
timing parameter, $\tau$, is now as follows: Pick an arbitrary sampling
phase $\tau_0$, $|\tau_0| \le T/2$, and compute the next sampling phase
$\tau_1$ from the relation (assuming that a data transition occurs)

$$\tau_1 = \tau_0 - \tfrac{1}{2}\,(d_0 z_0(\tau_0)) = \tau_0 - \tfrac{1}{2}\,(2A\tau_0 + v_0). \tag{19}$$

The $(n+1)$th sampling phase is then related to the $n$th by the
recursion relation

$$\tau_{n+1} = \tau_n - \frac{1}{n+2}\,(d_n z_n(\tau_n)), \tag{20}$$

where we have taken $c_n$ to be $1/(n+1)$. For numerical evaluation
purposes it is convenient to normalize (18) and work with the regression
function†

$$m(\tau_n) = E[f(x_n, \tau_n)] = E[\tau_n + x_n], \tag{21}$$

where $\{x_n\}$ is a sequence of gaussian random variables with

$$E[x_n] = 0 \tag{22}$$

and

$$E[x_n^2] = \frac{N_0 T}{4A^2} = \frac{T^2}{8\rho},$$

where $\rho = A^2/[2N_0(1/T)]$ is the signal-to-noise ratio in a bit-rate
bandwidth.

† We are assuming that a linear theory applies, i.e., the sequence $\{\tau_n\}$
rarely exceeds $|T/2|$. In practice no values of $\tau_n$ which exceed $|T/2|$
will be accepted. Including these restrictions in the mathematical model will
render equations (19), (20), and (21) nonlinear and thus mathematically
intractable.

Upon substituting (21) into (20) a linear recursion relation is
obtained with the well-known solution

$$\tau_n = \frac{\tau_0}{n+1} - \frac{1}{n+1}\sum_{k=0}^{n-1} x_k, \qquad |\tau_n| \le T/2. \tag{23}$$

By inspection the following pertinent parameters are computed:

$$\mu_n = E[\tau_n] = \frac{\tau_0}{n+1} \to 0, \quad \text{as } n \to \infty$$

and

$$\sigma_n^2 = E[\tau_n^2] - E^2[\tau_n] = \mathrm{var}\,\tau_n = \frac{n}{(n+1)^2}\,\frac{T^2}{8\rho} \to 0, \quad \text{as } n \to \infty. \tag{24}$$

In evaluating (24) we assumed that the sequence of random variables
$\{x_n\}$ is independent. This is not strictly true. We see from (17)
that the sequence of random variables $\{x_n\}$ for fixed $\tau$ is
indeed independent since each $x_n$ represents nonoverlapping integrals
of the white-noise process $\nu(t)$. However, as $\tau_n$ is changed
according to equation (20) the noise integrals may overlap. To include
this dependence in the analysis would render this seemingly simple
problem mathematically intractable. Physically we feel, however, that
this dependence is weak and therefore can be neglected.
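The normalized recursion and its closed-form solution (23) can be checked against each other with a few lines of code. This is a sketch under stated assumptions: the step from $\tau_n$ is taken with gain $1/(n+2)$, which is the reading that reproduces the factor $1/2$ in (19) and the solution (23); the limiting of $|\tau_n|$ to $T/2$ is ignored, as in the linear theory; and the function names are ours.

```python
def run_recursion(tau0, xs):
    """tau_{n+1} = tau_n - (tau_n + x_n) / (n + 2), for n = 0, 1, ..."""
    tau = tau0
    for n, x in enumerate(xs):
        tau -= (tau + x) / (n + 2)
    return tau

def closed_form(tau0, xs):
    """Solution (23): tau_n = tau0/(n+1) - (x_0 + ... + x_{n-1})/(n+1)."""
    n = len(xs)
    return tau0 / (n + 1) - sum(xs) / (n + 1)

def variance_24(n, T, rho):
    """Variance implied by (23) for independent x_k with E[x^2] = T^2/(8 rho)."""
    return n / (n + 1) ** 2 * T ** 2 / (8.0 * rho)

# Arbitrary illustrative noise values; any sequence gives the same agreement.
xs = [0.3, -0.1, 0.25, 0.0, -0.4]
tau_recursive = run_recursion(0.5, xs)
tau_closed = closed_form(0.5, xs)
```

The two results agree for any noise sequence, and `variance_24` shows the $1/n$ decay of the jitter for large $n$.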

From (23) we see that $\tau_n$ possesses a truncated gaussian
probability density

$$P(\tau_n) = \begin{cases} p_1\,\delta(\tau_n - T/2) + p_2\,\delta(\tau_n + T/2) + G(\tau_n), & |\tau_n| \le T/2 \\ 0, & |\tau_n| > T/2 \end{cases} \tag{25}$$

where

$$G(\tau_n) = \frac{1}{\sqrt{2\pi}\,\sigma_n}\exp\left\{-\frac{(\tau_n - \mu_n)^2}{2\sigma_n^2}\right\},$$

$$p_1 = \int_{-\infty}^{-T/2} G(\tau_n)\,d\tau_n,$$

and

$$p_2 = \int_{T/2}^{\infty} G(\tau_n)\,d\tau_n.$$

Using this probability density we can compute the system error rate.
Dispensing with tedious computational details, and focusing attention
on essentials, we find that the conditional error rate (conditioned
on the unknown parameter $\tau_n$) for this simple system is
asymptotically (large signal-to-noise ratio)

$$P_e(\tau_n) \sim \exp\left\{-\frac{A^2[T - 2|\tau_n|]^2}{2\sigma^2}\right\}, \qquad |\tau_n| \le T/2, \tag{26}$$

where

$$\sigma^2 = N_0 T.$$

When $\tau_n = 0$, we have ideal performance, as we should. When
$\tau_n = \pm T/2$, we have disaster. To obtain the actual error rate we
must average (26) over the permissible values of $\tau_n$. This
calculation yields

$$P_e \sim p_1 P_e(T/2) + p_2 P_e(-T/2) + \int_{-T/2}^{T/2} P_e(\tau_n)\,G(\tau_n)\,d\tau_n. \tag{27}$$

The evaluation of (27) is straightforward. In terms of the normalized
random variable $a = \tau_n/T$, we express (26) in the form

$$P_e(a) \sim e^{-\rho(1 - 2|a|)^2}, \qquad |a| < 1/2. \tag{28}$$

In terms of the same normalized variables and the explicit values of
$\mu_n$ and $\sigma_n$ [equation (24)] we write

$$G(a) \sim \exp\left\{-4n\rho\left(a - \frac{1}{2n}\right)^2\right\}, \tag{29}$$

which is valid when $n$ is large. In writing down (29) we set
$\tau_0 = T/2$ (a worst initial guess).

Asymptotically, $p_1$ and $p_2$ behave as $e^{-n\rho}$ and, as we shall
see shortly, can be neglected compared with the last term in (27). To
conclude the error rate calculation we evaluate

$$\int_{-T/2}^{T/2} P_e(\tau_n)\,G(\tau_n)\,d\tau_n = I(\rho) \sim \int_{-1/2}^{1/2} \exp\left\{-\rho\left[(1 - 2|a|)^2 + 4n\left(a - \frac{1}{2n}\right)^2\right]\right\} da = \int_{-1/2}^{0} e^{-\rho E_1(a)}\,da + \int_0^{1/2} e^{-\rho E_2(a)}\,da, \tag{30}$$

where

$$E_1(a) = (1 + 2a)^2 + 4n\left(a - \frac{1}{2n}\right)^2$$

and

$$E_2(a) = (1 - 2a)^2 + 4n\left(a - \frac{1}{2n}\right)^2.$$

Using a saddle-point technique to obtain an asymptotic approximation
for the integrals, we find that

$$P_e(\rho) \sim e^{-\rho M_1(n)} + e^{-\rho M_2(n)},$$

where

$$M_1(n) \sim 1 + \frac{1}{n}$$

and

$$M_2(n) \sim 1 - \frac{3}{n}. \tag{31}$$

Combining the above asymptotic results with $p_1$ and $p_2$ we obtain
finally

$$P_e \sim \exp\left\{-\rho\left(1 - \frac{3}{n}\right)\right\} \tag{32}$$

for $n$ and $\rho$ large. All the other terms have exponents larger than
(32) and therefore can be neglected. For example when $n = 30$, the
degradation from ideal ($n \to \infty$) is only 0.5 dB approximately.

What this example shows is that for square-wave modulation in
the presence of additive white gaussian noise, bit timing can reliably
be derived in approximately 30 bit intervals.
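The quoted degradation can be reproduced from the exponent of the final error-rate expression. The sketch below assumes that exponent is $\rho(1 - 3/n)$, which is the reading consistent with the stated 0.5 dB at $n = 30$; the function name is ours.

```python
import math

def degradation_db(n):
    """Loss in effective signal-to-noise ratio, in dB, for exponent rho*(1 - 3/n)."""
    if n <= 3:
        raise ValueError("the asymptotic form only makes sense for n > 3")
    return -10.0 * math.log10(1.0 - 3.0 / n)

# Degradation after a few representative numbers of iterations.
table = {n: round(degradation_db(n), 3) for n in (10, 30, 100)}
```

The loss falls off roughly as $13/n$ dB for large $n$, so doubling the number of signaling intervals used for acquisition roughly halves the residual penalty.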

4.3 Synchronization of Bandlimited PAM 

We now consider a timing recovery algorithm for a bandlimited
PAM signal. As in the previous section the synchronization problem
will be cast as a regression problem. Our received signal is given by
equation (1)

$$V(t) = \sum_m a_m h(t - mT - \tau^*) + \nu(t), \tag{33}$$

and as before the objective of the synchronizer is to accurately and
rapidly estimate $\tau^*$. In order to extract information about
$\tau^*$ we low-pass filter, differentiate, and sample the received
signal. Hence the error signal is similar to that shown in Fig. 2, with
the matched filter replaced by a low-pass filter. Thus the receiver does
not need knowledge of the pulse $h(t)$. If we denote the derivative of
$h(\cdot)$ by $g(\cdot)$, then the differentiated and sampled received
signal is given by

$$\begin{aligned} V'(kT + \tau) &= \sum_m a_m\, g[(k-m)T + \tau - \tau^*] + \nu'(kT + \tau) \\ &= \sum_m a_m\, g_{k-m}(\tau - \tau^*) + \nu_k, \end{aligned} \tag{34}$$

where $\tau$ is an arbitrary sampling time such that $|\tau| < T/2$,
$g_{k-m}(x)$ denotes $g((k-m)T + x)$ with $g_{k-m}$ alone denoting
$g((k-m)T)$, and $\nu_k$ are samples† of the differentiated noise
process. As before we let $\hat{a}_k$ denote the decision made at time
$kT + \tau$. Assuming that the error rate is low enough so that with
high probability $\hat{a}_k = a_k$, we then have that

$$\begin{aligned} \hat{a}_k V'(kT + \tau) &= \hat{a}_k a_k\, g_0(\tau - \tau^*) + \hat{a}_k \sum_{m \ne k} a_m\, g_{k-m}(\tau - \tau^*) + \nu_k \\ &= a_k^2\, g(\tau - \tau^*) + \hat{a}_k \sum_{m \ne k} a_m\, g_{k-m}(\tau - \tau^*) + \nu_k, \end{aligned} \tag{35}$$

where we have noted that $g_0(\tau - \tau^*) = g(\tau - \tau^*)$. If we
further assume that $\hat{a}_k$ is uncorrelated with $a_j$,‡ for
$j \ne k$, then averaging (35) gives

$$m(\tau) \triangleq E[\hat{a}_k V'(kT + \tau)] = \overline{a^2}\, g(\tau - \tau^*), \tag{36}$$

where $\overline{a^2} = E[a_k^2]$.

Now for the typical impulse response $h(t)$ and its derivative $g(t)$,
shown in Figs. 4 and 5, respectively, it is true that the (regression)
equation

$$g(\tau - \tau^*) = 0, \qquad |\tau - \tau^*| \le T/2 \tag{37}$$

has the unique solution

$$\tau = \tau^*. \tag{38}$$

Since the synchronization problem has been modeled as a regression
problem, we again use a Robbins-Monro algorithm to sequentially
estimate $\tau^*$. Denoting the $k$th estimate by $\tau_k$, we have the
modified Robbins-Monro algorithm

$$\tau_{k+1} = \begin{cases} \tau_k + c_k[\hat{a}_k V'(kT + \tau_k)], & |\tau_k + c_k[\hat{a}_k V'(kT + \tau_k)]| < T/2 \\ \tau_k, & \text{otherwise.} \end{cases} \tag{39}$$

† The dependence of the noise sample on the sampling offset $\tau$ is not shown,
since it is assumed that the noise is stationary.

‡ As it will be if the $a_k$'s are independent and the receiver is supplied with
an ideal reference.

Fig. 4 — A typical impulse response h(t).

A feedback implementation of the above algorithm is shown in Fig. 6,
with $D$ denoting a delay. It is again noted that the algorithm
constrains the estimate to a region of width $T$. This is consistent
with the observation that any actual sampling instant will always be
within $T/2$ seconds of the desired instant $\tau^*$, i.e., we may
"slip" $T$ seconds but this is immaterial as far as estimating $\tau^*$
is concerned. It is by no means clear, a priori, that the above
algorithm will converge rapidly or will converge at all. In fact the
rest of this paper will consider the conditions which must be satisfied
for the above algorithm to converge and the resulting rate of
convergence.
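To make the constrained recursion concrete, the following sketch simulates it for binary data with an ideal reference. Several things here are assumptions made purely for illustration and are not taken from the analysis: the pulse $h(t) = \exp(-t^2/2S^2)$ (which is not strictly bandlimited), the gain $c_k = 1/[\alpha(k+1)]$ with $\alpha = -g'(0)$, the truncation of the intersymbol interference to three neighbors on each side, and all names.

```python
import math
import random

T = 1.0
S = 0.4 * T  # width of the assumed pulse (hypothetical choice)

def g(t):
    """g = dh/dt for the assumed pulse h(t) = exp(-t**2 / (2 * S**2))."""
    return -(t / S ** 2) * math.exp(-t * t / (2.0 * S * S))

def simulate(tau_star, tau0, n_iter, noise_std, seed=1, span=3):
    """Run the constrained recursion with decisions equal to the data."""
    rng = random.Random(seed)
    # Binary data; the list is padded by `span` symbols on each side.
    a = [rng.choice((-1, 1)) for _ in range(n_iter + 2 * span)]
    alpha = 1.0 / S ** 2  # equals -g'(0) for the assumed pulse
    tau = tau0
    for k in range(n_iter):
        # Differentiated, sampled signal: sum over m = k + j of a_m g((k-m)T + tau - tau*).
        v = sum(a[k + span + j] * g(-j * T + tau - tau_star)
                for j in range(-span, span + 1))
        v += rng.gauss(0.0, noise_std)
        c_k = 1.0 / (alpha * (k + 1))
        candidate = tau + c_k * a[k + span] * v
        if abs(candidate) < T / 2:  # otherwise keep the previous estimate
            tau = candidate
    return tau

final_tau = simulate(tau_star=0.1, tau0=-0.3, n_iter=500, noise_std=0.1)
```

Because $g$ has negative slope at the origin, the correction $c_k \hat{a}_k V'$ pushes the estimate toward $\tau^*$ on the average, while the decreasing gain averages out the intersymbol interference and noise terms.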

V. ANALYSIS OF THE SYNCHRONIZATION ALGORITHM 

5.1 The Error Equation 

In order to evaluate the proposed synchronization algorithm we
will derive a difference equation for the mean-square estimation error
$\overline{e_k^2}$, where

$$e_k = \tau_k - \tau^* \tag{40}$$

Fig. 5— The derivative of h(t). 






THE BELL SYSTEM TECHNICAL JOURNAL, MAY-JUNE 1971 









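[Block diagram; the labels recoverable from the figure are: $V(t)$, LPF, $d/dt$, sampler at $t = kT + \tau_k$, time-varying gain $c_k$, delay $D$, and output $\tau_{k+1}$.]

Fig. 6 — A realization of the synchronization algorithm.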

and the overbar denotes expectation.† In order to do this we see that
from (39), and neglecting for the moment the constraining portion of
the algorithm, we have

$$\begin{aligned} e_{k+1} = \tau_{k+1} - \tau^* &= \tau_k - \tau^* + c_k[\hat{a}_k V'(kT + \tau_k)] \\ &= \tau_k - \tau^* + c_k\Big[g(\tau_k - \tau^*) + \hat{a}_k \sum_{m \ne k} a_m\, g_{k-m}(\tau_k - \tau^*) + \nu_k\Big] \\ &= e_k + c_k\Big[g(e_k) + \hat{a}_k \sum_{m \ne k} a_m\, g_{k-m}(e_k) + \nu_k\Big]. \end{aligned} \tag{41}$$

We note that $g(\cdot)$ is such that, on the average, the error is
decreased at each iteration, and once the estimation error is small‡ we
need only keep first-order terms in a Taylor series expansion of
$g_{k-m}(e_k)$ about $(k-m)T$, i.e.,

$$g_{k-m}(e_k) \approx g_{k-m} + g'_{k-m}\, e_k, \tag{42}$$

where $g'_{k-m}$ denotes the derivative of $g(\cdot)$ evaluated at
$(k-m)T$. Combining (41) and (42) yields the (approximate) first-order
stochastic difference equation for the evolution of the error,

$$e_{k+1} = \Big[1 + g'_0 c_k + c_k \hat{a}_k \sum_{m \ne k} a_m\, g'_{k-m}\Big] e_k + c_k\Big[\hat{a}_k \sum_{m \ne k} a_m\, g_{k-m} + \nu_k\Big]. \tag{43}$$

Before studying the behavior of (43) we introduce the following
notation:

$$g'_0 = -\alpha \tag{44a}$$

$$\beta_k = c_k\Big[g'_0 + \hat{a}_k \sum_{m \ne k} a_m\, g'_{k-m}\Big] \tag{44b}$$

$$\gamma_k = 1 + \beta_k \tag{44c}$$

$$Q_k = \hat{a}_k \sum_{m \ne k} a_m\, g_{k-m} + \nu_k, \tag{44d}$$

and using the above we rewrite (43) as

$$e_{k+1} = \gamma_k e_k + c_k Q_k. \tag{45}$$

† We use the mean-square estimation error as a measure of performance. This
is because the estimate is a nonlinear one, and thus the probability of error
cannot be computed.

‡ Under this assumption we can certainly neglect the possibility that
$\tau_{k+1} = \tau_k$.

Thus the error obeys a stochastic difference equation where the gain ($\gamma_k$) and the driving term ($Q_k$) are correlated. It is important to note that for the system described by (45) the probability density of the present error $e_k$ does not depend solely on a finite number of past data symbols, $a_k$, but depends on all past and future values. This renders impossible an exact analysis of the mean-square error. However, if we assume that both $\gamma_k$ and $Q_k$ are independent sequences, then $e_k$ depends solely on past $\gamma_k$ and $Q_k$, and we can obtain a bound on the mean-square error.† Squaring and averaging both sides of (45) gives

$$E[e_{k+1}^2] = E[\gamma_k^2 e_k^2] + 2 c_k E[\gamma_k e_k Q_k] + c_k^2 E[Q_k^2]. \tag{46}$$

We now proceed to bound each of the terms on the right-hand side of (46). If we assume that the "eye" of the twice-differentiated impulse response is open, i.e.,

$$\alpha > \sum_{m \neq 0} |g'_m|, \tag{47}$$

then

$$\gamma_k = 1 - c_k\Big(\alpha - d_k \sum_{m \neq k} a_m g'_{k-m}\Big) \le 1 - c_k\Big(\alpha - \sum_{m \neq 0} |g'_m|\Big) \tag{48a}$$

$$= 1 - c_k \beta, \tag{48b}$$

where $\beta$ denotes $\alpha - \sum_{m \neq 0} |g'_m|$. Using the above assumption, and the boundedness of the error, we have that

$$|E[\gamma_k Q_k e_k]| = |E[e_k]\,E[\gamma_k Q_k]| \le \frac{T}{2}\,|E[\gamma_k Q_k]|, \tag{49}$$



† Despite much effort, we have been unable to proceed without this assumption, but since the results which follow are intuitively satisfying and provide insight into this difficult problem they have been included in the paper.

and due to the independence of the data bits

$$E[\gamma_k Q_k] = E\Big[\Big((1 - c_k \alpha) + c_k d_k \sum_{m \neq k} a_m g'_{k-m}\Big)\Big(d_k \sum_{l \neq k} a_l g_{k-l} + \nu_k\Big)\Big] = c_k \sum_{m \neq 0} g'_m g_m = c_k \frac{2}{T} G, \tag{50}$$

where† $G$ denotes $\frac{T}{2} \sum_{m \neq 0} g'_m g_m$. Finally we have

$$E[Q_k^2] = \sigma^2 + \sum_{m \neq 0} g_m^2 = \sigma^2 + P, \tag{51}$$

where $P$ denotes $\sum_{m \neq 0} g_m^2$. Letting

$$\Delta_k = E[e_k^2], \tag{52}$$

and combining (46)-(52) we have the iterative bound

$$\Delta_{k+1} \le (1 - \beta c_k)^2 \Delta_k + c_k^2 M \tag{53}$$

on the mean-square error, where $M$ is the sum of $G$ and $\sigma^2 + P$. Although several assumptions have been made in obtaining (53) it is believed that the effect of the salient quantities upon the synchronization algorithm has been preserved. We now proceed to find the gain sequence which minimizes the bound of (53).
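
The recursion (53) is easy to explore numerically. The following Python sketch iterates the right-hand side of (53), taken with equality, for an arbitrary gain sequence. The function name `iterate_bound` and the numerical values of β, M and Δ_0 are illustrative assumptions, not values taken from the paper.

```python
def iterate_bound(delta0, gains, beta, M):
    """Iterate Delta_{k+1} = (1 - beta*c_k)**2 * Delta_k + c_k**2 * M.

    This is the right-hand side of (53) taken with equality.
    Returns the list [Delta_0, Delta_1, ..., Delta_K] for K gains.
    """
    deltas = [delta0]
    for c in gains:
        deltas.append((1.0 - beta * c) ** 2 * deltas[-1] + c ** 2 * M)
    return deltas


if __name__ == "__main__":
    # Hypothetical parameters, chosen only for illustration.
    beta, M, delta0 = 2.0, 0.5, 0.25
    c = 0.1
    history = iterate_bound(delta0, [c] * 200, beta, M)
    # With a constant gain the recursion settles at c*M / (beta*(2 - beta*c)).
    print(history[-1], c * M / (beta * (2.0 - beta * c)))
```

With a constant gain the iterated bound levels off at a nonzero floor, which suggests why a decaying gain sequence is of interest.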

5.2 The Optimum Gain Sequence 

We now find the sequence of gains, $c_k^*$, which minimizes the right-hand side (RHS) of (53) for fixed $\Delta_k$. Since we minimize a bound on the mean-square error at every iteration, this is a min-max procedure. We first find the optimum gain sequence in terms of $\Delta_k$, and then by simultaneously iterating this equation and the bound of (53) we show that $c_k^*$ is proportional to $1/k$ for large $k$. We begin by setting to zero the derivative of the RHS of (53) with respect to $c_k$, i.e.,

$$-\beta(1 - \beta c_k)\Delta_k + M c_k = 0$$

or

$$c_k^* = \frac{\beta \Delta_k}{M + \beta^2 \Delta_k}. \tag{54}$$

Using (54) in (53) we have

$$\Delta_{k+1} \le \frac{M}{\beta}\, c_k^*. \tag{55}$$



† It should be noted that if $h(t)$ is an even function of time (with respect to the origin), then $g(t)$ and $g'(t)$ will be respectively odd and even time functions and $G$ will be zero.

Since (54) gives $\Delta_{k+1} = M c_{k+1}^* / [\beta(1 - \beta c_{k+1}^*)]$, the bound (55) can be written as

$$\frac{c_{k+1}^*}{1 - \beta c_{k+1}^*} \le c_k^*. \tag{56}$$

Now if

$$(1 - \beta c_{k+1}^*) \ge 0 \tag{57}$$

then we have the relation

$$c_{k+1}^* \le \frac{c_k^*}{1 + \beta c_k^*}, \tag{58}$$

which can be iterated to give†

$$c_k^* \le \frac{c_0^*}{1 + \beta c_0^* k} \tag{59a}$$

$$= \frac{\beta \Delta_0}{M + \beta^2 \Delta_0 (k + 1)}, \tag{59b}$$

where

$$c_0^* = \frac{\beta \Delta_0}{M + \beta^2 \Delta_0} \tag{60}$$

and $\Delta_0$ is the initial error variance. Henceforth we will interpret the sequence $c_k^*$ specified by (59) and (60), with the inequality replaced by an equality, as the optimum gain sequence. Combining (55), (59) and (60) we see that the mean-square error is bounded by



$$\Delta_k \le \frac{M \Delta_0}{M + \beta^2 \Delta_0 k}, \tag{61a}$$

which for large $k$ becomes

$$\Delta_k \le \frac{M}{\beta^2 k}. \tag{61b}$$

Thus we see that asymptotically the minimized mean-square error is bounded by a term which decays as $1/k$, and is inversely proportional to the signal-to-noise type ratio ($\beta^2/M$).
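
As a numerical check, the Python sketch below jointly iterates the gain rule (54) and the recursion (53) taken with equality, and compares the result with the closed forms that follow from that iteration (cf. (59b) and (61a)). All function names and parameter values are hypothetical and serve only as an illustration.

```python
def optimum_gain(delta, beta, M):
    """Gain rule (54): c*_k = beta*Delta_k / (M + beta**2 * Delta_k)."""
    return beta * delta / (M + beta ** 2 * delta)


def run_optimum(delta0, beta, M, steps):
    """Jointly iterate (54) and (53), the latter taken with equality.

    Returns (gains, deltas) with gains[k] = c*_k and deltas[k] = Delta_k.
    """
    gains, deltas = [], [delta0]
    for _ in range(steps):
        c = optimum_gain(deltas[-1], beta, M)
        gains.append(c)
        deltas.append((1.0 - beta * c) ** 2 * deltas[-1] + c ** 2 * M)
    return gains, deltas


def closed_form_gain(k, delta0, beta, M):
    """beta*Delta_0 / (M + beta**2 * Delta_0 * (k+1)); cf. (59b)."""
    return beta * delta0 / (M + beta ** 2 * delta0 * (k + 1))


def closed_form_mse(k, delta0, beta, M):
    """M*Delta_0 / (M + beta**2 * Delta_0 * k); cf. (61a)."""
    return M * delta0 / (M + beta ** 2 * delta0 * k)


if __name__ == "__main__":
    beta, M, delta0 = 2.0, 0.5, 0.25  # hypothetical values
    gains, deltas = run_optimum(delta0, beta, M, 1000)
    # k * Delta_k should approach M / beta**2 for large k, as in (61b).
    print(1000 * deltas[1000], M / beta ** 2)
```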

The optimum gain, as given by (59b), depends upon the parameters $\Delta_0$, $M$, and $\beta$. Since these quantities are generally unknown it is tempting to replace $c_k^*$ by its asymptotic (large $k$) value $1/[\beta(k + 1)]$. Caution must be exercised in making this approximation; since $M \gg \beta^2 \Delta_0$ implies that the optimum gain sequence is essentially constant for many iterations, substitution of a decaying sequence could lead to an unreliable estimate (we will consider this point in Section 5.4). However if $\beta^2 \Delta_0 \gg M$, then $c_k^* \approx 1/[\beta(k + 1)]$ and we have only one unknown parameter. A possibility is to replace $\beta$ by an estimate; techniques of this sort are called adaptive estimation procedures. We now sketch a particular adaptive scheme.

† Note that $\beta c_k^* \le \beta c_0^*/(1 + \beta c_0^* k) \le 1$, thus satisfying (57).

5.3 An Adaptive Synchronization Algorithm 

We now give a method for recursively estimating $\beta$, which can then be incorporated in an adaptive synchronization scheme. Since

$$\beta = \alpha - \sum_{m \neq 0} |g'_m|,$$

we desire a function of the received data which has $\beta$ as its average value. We note that from (34) we have

$$E[d_k V''(kT + \tau)] = g'(\tau - \tau^*) \approx -\alpha \tag{62a}$$

(where the approximation is for small $\tau - \tau^*$), and

$$E[d_l V''(kT + \tau)] = g'_{k-l}(\tau - \tau^*) \approx g'_{k-l}. \tag{62b}$$

We can then estimate $\beta$ by using a recursive stochastic approximation algorithm of the type discussed in Section 4.1. Such a scheme would twice differentiate the incoming data and then multiply the data sample by as many of the previous decisions as there are significant nonzero samples in the impulse response. Since even an approximate analysis of the above algorithm is hopelessly complex, we will consider the effect of using a gain of the form $c/k$, where $c$ is a constant to be chosen.
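
A recursive estimate of this general kind can be sketched as a running average with gain 1/k, which is the simplest stochastic approximation rule. In the Python sketch below, the samples `x` are synthetic stand-ins for a function of the received data whose average value is β; how such samples would actually be formed from the twice-differentiated signal is not specified here, and the names and values are hypothetical.

```python
import random


def recursive_mean(samples, initial=0.0):
    """Stochastic-approximation estimate of a mean.

    Update: est_k = est_{k-1} + (1/k) * (x_k - est_{k-1}),
    which reproduces the arithmetic mean of the first k samples.
    """
    est = initial
    for k, x in enumerate(samples, start=1):
        est += (x - est) / k
    return est


if __name__ == "__main__":
    rng = random.Random(0)
    beta_true = 3.0  # hypothetical value, for illustration only
    x = [beta_true + rng.gauss(0.0, 1.0) for _ in range(5000)]
    print(recursive_mean(x))  # should lie close to beta_true
```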

5.4 A Suboptimum Gain 

We consider the mean-square error, as given by (53), with $c_k = c/k$. This gain is chosen since the optimum gain is asymptotically of this form. Care must be taken in choosing $c$, since the mean-square error will be shown to be a sensitive function of this parameter. Iterating (53) gives

$$\Delta_{k+1} \le \prod_{i=1}^{k} (1 - \beta c_i)^2 \Delta_1 + \sum_{i=1}^{k} \prod_{j=i+1}^{k} (1 - \beta c_j)^2 c_i^2 M. \tag{63}$$



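† A condition one would expect to be satisfied in practice.

The inequality

$$1 - x \le e^{-x} \tag{64}$$

gives

$$\prod_{j=i+1}^{k} (1 - \beta c_j)^2 \le \exp\Big(-2 \sum_{j=i+1}^{k} \frac{\beta c}{j}\Big), \tag{65}$$

and noting that

$$\sum_{j=i+1}^{k} \frac{1}{j} \ge \ln \frac{k+1}{i+1}$$

results in

$$\prod_{j=i+1}^{k} (1 - \beta c_j)^2 \le \Big(\frac{i+1}{k+1}\Big)^{2\beta c}. \tag{66}$$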
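
The chain of inequalities leading to (66) can be spot-checked numerically. The Python sketch below compares the product of squared factors for $c_j = c/j$ with the power-law bound obtained from the standard integral estimate $\sum_{j=i+1}^{k} 1/j \ge \ln[(k+1)/(i+1)]$. It assumes $0 \le \beta c \le 1$ so that every factor $1 - \beta c/j$ is non-negative; the function names are hypothetical.

```python
def gain_product(i, k, beta_c):
    """Product over j = i+1..k of (1 - beta*c/j)**2, with c_j = c/j."""
    prod = 1.0
    for j in range(i + 1, k + 1):
        prod *= (1.0 - beta_c / j) ** 2
    return prod


def power_bound(i, k, beta_c):
    """((i+1)/(k+1))**(2*beta*c), the power-law bound on the product."""
    return ((i + 1.0) / (k + 1.0)) ** (2.0 * beta_c)


if __name__ == "__main__":
    for beta_c in (0.3, 0.5, 1.0):
        print(beta_c, gain_product(1, 100, beta_c), power_bound(1, 100, beta_c))
```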



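We can see that the transient behavior of the mean-square error, which is specified by the first term on the RHS of (63), will be of the form $(1/k)^{2\beta c}$. The other component of the mean-square error will be (approximately) bounded by

$$\frac{M c^2}{(k+1)^{2\beta c}} \sum_{i=1}^{k} (i+1)^{2\beta c - 2} \approx \frac{M c^2}{(2\beta c - 1)(1 + k)}, \tag{67a}$$

which results in

$$\Delta_{k+1} \le \Big(\frac{1}{k}\Big)^{2\beta c} \Delta_1 + \frac{M c^2}{(2\beta c - 1)(1 + k)} \tag{67b}$$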

as a bound on the mean-square error. If $2\beta c > 1$, then for large $k$ the above bound becomes

$$\Delta_{k+1} \le \frac{M c^2}{(2\beta c - 1)(1 + k)} \tag{67c}$$

and the mean-square error will converge at the optimum rate ($1/k$). It is seen that care must be taken in selecting $c$, since for $c > 1/(2\beta)$ (i.e., for $2\beta c > 1$) the quantity $M c^2/(2\beta c - 1)$ has a minimum† at $c = 1/\beta$, and is infinite at both $c = 1/(2\beta)$ and $c = \infty$. Thus a very small step size ($c \ll 1/(2\beta)$) will result in a mean-square error which converges at a less than optimum rate, while large step sizes ($c \gg 1/(2\beta)$) will result in a mean-square error which, while converging at the optimum rate, may be quite large for many iterations. The sensitivity of the above bound with respect to $c$ may make the use of an adaptive procedure (which estimates $\beta$) advisable.

† With $c = 1/\beta$, $\Delta_k \le (M/\beta^2)(1/k)$, which is the optimum asymptotic rate of convergence.
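
The sensitivity to the choice of c can be made concrete with a few numbers. The Python sketch below evaluates the coefficient $M c^2/(2\beta c - 1)$ of the large-k bound for several step sizes. The function name and the values of β and M are hypothetical.

```python
def asymptotic_factor(c, beta, M):
    """M*c**2 / (2*beta*c - 1): coefficient of 1/k in the large-k bound.

    Only meaningful for 2*beta*c > 1.
    """
    if 2.0 * beta * c <= 1.0:
        raise ValueError("requires 2*beta*c > 1")
    return M * c ** 2 / (2.0 * beta * c - 1.0)


if __name__ == "__main__":
    beta, M = 2.0, 0.5  # hypothetical values
    for scale in (0.6, 0.8, 1.0, 2.0, 5.0):
        c = scale / beta
        print(f"c = {c:.3f}  factor = {asymptotic_factor(c, beta, M):.4f}")
```

The smallest coefficient occurs at c = 1/β, where it equals M/β², and it grows both as c approaches 1/(2β) from above and as c becomes large.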

5.5 An Example 

Consider the (minimum bandwidth) pulse

$$h(t) = A\,\frac{\sin \pi W t}{\pi W t}, \tag{68}$$

where $W = 1/T$. It is easy to show that

(3 = \A{irWf 

M = <r 2 + j A 2 W 2 ; 

thus from (61b) the percentage minimized mean-square error is 
bounded by 



\a) \wW ) 



t 2 < TYk ~ y 2 k ' {W) 

For a 30 dB signal-to-noise ($A/\sigma$) ratio, and with $W = 3000$ Hz, we see that $\Delta_k/T^2$ is less than 0.01 for $k \ge 10$. In other words, after 10 symbols have been received, the above synchronization algorithm reduces the mean-square error to less than 1/100 of the squared symbol interval ($T^2$).
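
The behavior of the error recursion (41) with a gain of the form c/k can also be examined by direct simulation. The Python sketch below is only an illustration under stated assumptions: it uses a hypothetical Gaussian pulse rather than the pulse of (68), an ideal reference ($d_k = a_k$), the gain $c/(k+1)$ so that the first step is well defined, and an assumed form of the constraint in which an update is skipped if it would take the error outside $[-T/2, T/2]$. None of the numerical values come from the paper.

```python
import math
import random

T = 1.0          # symbol interval (normalized)
S = 0.4 * T      # width of the hypothetical Gaussian pulse


def g(t):
    """Derivative of the assumed pulse h(t) = exp(-t**2 / (2*S**2))."""
    return -t / S ** 2 * math.exp(-t ** 2 / (2.0 * S ** 2))


def simulate(num_symbols, c, e0, sigma, seed=0, span=3):
    """Run e_{k+1} = e_k + c_k * d_k * V'(kT + tau_k) with c_k = c/(k+1).

    d_k = a_k (ideal reference). The update is skipped whenever it would
    move the error outside [-T/2, T/2] (an assumed form of the constraint).
    Returns the list of timing errors e_0, ..., e_N.
    """
    rng = random.Random(seed)
    # Binary data a_m = +/-1, padded so every symbol has `span` neighbours.
    a = [rng.choice((-1.0, 1.0)) for _ in range(num_symbols + 2 * span)]
    e = e0
    errors = [e]
    for k in range(num_symbols):
        idx = k + span
        # Sampled derivative of the received signal: ISI terms plus noise.
        v_prime = sum(a[idx - n] * g(n * T + e) for n in range(-span, span + 1))
        v_prime += rng.gauss(0.0, sigma)
        proposal = e + c / (k + 1) * a[idx] * v_prime
        if abs(proposal) <= T / 2:
            e = proposal
        errors.append(e)
    return errors


if __name__ == "__main__":
    errs = simulate(num_symbols=2000, c=0.3, e0=0.3, sigma=0.05)
    for k in (0, 10, 100, 1000, 2000):
        print(k, errs[k])
```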